The Vocabulary Problem in Spoken Dialogue Systems

Author

  • Susan E. Brennan
Abstract

Designers of spoken dialogue systems need to be able to predict and constrain the words people use in speech directed at these systems. Larger vocabularies lead to longer processing times, as well as a substantial increase in perplexity errors; according to one estimate (Makhoul 1993), error rates increase with the square root of the number of words in the vocabulary (assuming all words are equally likely). The problem is that a speaker of English may know more than 100,000 words.[1] With such abundance in the mental lexicon, the potential for variability in a speaker's word choices is enormous. Consider, for example, the abstract geometric object in Figure 1 and the referring expressions for it that were spontaneously produced in thirteen different conversations. This object is not lexicalized (it has no conventional label), and in virtually every conversation, speakers took quite a different perspective on it. The mapping of terms to referents is many-to-one, especially when a domain is unfamiliar or when alternative conceptualizations are available. This potential for variability in word choice has been dubbed the vocabulary problem by Furnas, Landauer, Gomez, and Dumais (1983, 1987).
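
To make the square-root estimate concrete, here is a minimal sketch of how the relative error rate would scale with vocabulary size if that estimate held exactly. The function name and the baseline of 1,000 words are illustrative assumptions, not values from the paper, and the proportionality itself is only the rough estimate quoted above.

```python
import math


def relative_error_rate(vocab_size: int, baseline_vocab_size: int) -> float:
    """Error rate relative to a baseline vocabulary.

    Assumes error rate is proportional to sqrt(vocabulary size), per the
    estimate quoted in the abstract. Hypothetical helper for illustration.
    """
    if vocab_size <= 0 or baseline_vocab_size <= 0:
        raise ValueError("vocabulary sizes must be positive")
    return math.sqrt(vocab_size / baseline_vocab_size)


# Compare a few vocabulary sizes against an assumed 1,000-word baseline.
for size in (1_000, 10_000, 100_000):
    print(size, round(relative_error_rate(size, 1_000), 2))
```

Under this assumption, growing the vocabulary a hundredfold would multiply the error rate by about ten, which is why constraining speakers' word choices matters for system design.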

Similar resources

Semantic processing of out-of-vocabulary words in a spoken dialogue system

One of the most important causes of failure in spoken dialogue systems is usually neglected: the problem of words that are not covered by the system’s vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is described for the detection, classification and processing of OOV words in an automatic train timetable information system [2]. The various extensions that had to be effe...

A Methodology for Diagnostic Evaluation of Spoken Human-Machine Dialogue

Diagnostic evaluation is an important instrument for the development of high quality spoken language dialogue systems. Yet no rigorous methodology exists for the systematic and exhaustive diagnostic evaluation of all aspects of spoken language interaction: recognition, synthesis, grammar, vocabulary, dialogue, etc. The paper addresses part of this problem by presenting a methodology for the dia...

A dynamic vocabulary spoken dialogue interface

Mixed-initiative spoken dialogue systems today generally allow users to query with a fixed vocabulary and grammar that is determined prior to run-time. This paper presents a spoken dialogue interface enhanced with a dynamic vocabulary capability. One or more word classes can be made dynamic in the speech recognizer and natural language (NL) grammar so that a context-specific vocabulary subset c...

Implementing Modular Dialogue Systems: A Case of Study

This paper presents a study of the main modules of the UAH spoken dialogue system, designed to provide information in an academic environment. The novel feature in the implementation of this system is the use of a new module to automatically create speech recognition grammars with vocabulary extracted from a database.

Semantic Processing of Out-of-Vocabulary Words in a Spoken Dialogue

One of the most important causes of failure in spoken dialogue systems is usually neglected: the problem of words that are not covered by the system's vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is described for the detection, classification and processing of OOV words in an automatic train timetable information system [2]. The various extensions that had to be effect...

Publication date: 1998